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I.   INTRODUCTION 

It  is  well  known  that  many  problems  in  physical  sciences  give 
rise  to  the  algebraic  eigenvalue  problem  Ax  =  Ax  where  A  is  usually  a 
large  symmetric  matrix.   The  most  effective  way  to  solve  such  problems 
consists  of  reducing  A  to  the  tridiagonal  form  [1,  13]  and  using  either  the 
bisection  method  or  the  QL  algorithm  to  find  few  or  all  the  eigenvalues. 
The  major  goal  of  this  work  is  to  study  and  analyze  both  algorithms  on  a 
parallel  machine,  namely  the  ILLIAC  IV,  and  find  out  the  cases  in  which 
each  algorithm  should  be  used. 

The  QL  algorithm  is  an  effective  and  reliable  algorithm  for  finding 
all  or  most  of  the  eigenvalues  on  a  serial  machine.   However,  the  implementa- 
tion of  this  algorithm  on  a  parallel  machine  is  very  ineffective  since  only 
two  or  three  PE's  can  be  used  at  the  most.   On  the  other  hand,  the  bisection 
method  is  usually  preferred  for  finding  only  few  of  the  eigenvalues  since  the 
algorithm  is  especially  designed  to  isolate  the  individual  eigenvalues. 
The  bisection  method  therefore  can  be  easily  generalized  for  implementation 
on  a  parallel  machine.   We  shall  describe  this  modified  algorithm  which  we 
will  call  Multisection  method,  and  show  how  it  is  implemented  on  the  ILLIAC  IV. 
We  will  also  give  comparisons  of  time  estimates,  for  finding  all  or  few  of 
the  eigenvalues  for  some  tridiagonal  matrices,  between  this  parallel  algorithm 
and  the  serial  QL  algorithm. 

Finally,  we  will  provide  in  an  APPENDIX  the  program  listing  written 
in  an  ILLIAC  IV  language,  GLYPNIR. 


II.   THE  MULTISECTION  METHOD 


1.   THEORETICAL  BACKGROUND 


(1  )   Sturm  Sequences 

It  is  usually  desired  to  determine  lower  and  upper  bounds 

for  the  real  roots  of  a  polynomial  so  that  errors  in  approximating 

them  can  be  easily  estimated.   This  can  be  done  if  we  are  able  to 

determine  the  number  of  roots  of  a  polynomial  within  an  interval, 

this  in  fact  can  be  achieved  using  the  property  of  the  Sturm 

sequences . 

The  Sturm  sequence  is  defined  as  follows  [10]: 

DEFINITION.   Let  f  (x),  f.  (x),  ...,  f  (x)  be  a  sequence  of 

o      1  m  u 

continuous  functions.   Such  a  sequence  is  called  a  Sturm  sequence 
on  an  interval  [a,  b],  where  either  a  or  b  may  be  infinite,  if 
(l)   f  (x)  has  at  most  simple  roots  in  [a,  b]; 


o 


(2)  f  (x)  does  not  vanish  in  [a,  bl; 

m  '    ' 

(3)  if  f  (a)  =  0,  then  f    (a)  f,  .  (a )  <  0 

-K.  K— X         K+l 

for  any  root  ae[a,  b]; 

(k)      if  fQ(a)  =  0,  then  f  (a)  f  (a)  >  0 

for  any   root  ae[a,   b]. 

For  every   such   sequence,   there  exists  the   following  property. 

THEOREM  I.       [10,   p.    126].      The  number  of  zeros  of  f    (x)    in    (a,  b)    is   equal 

o 

to  the  difference  between  the  number  of  sign  variations  in 

{f  (a),  f  (a),  ...,  f  (a)}  and  in  {f  (b ) ,  f_(b),  ...,  f  (b )}  provided  that 
o  ±  m  o      l  m 

f>0(a)fQ(b)  +   0  and  (f  (x),  ^(x),  ...,  f  (x)}  form  a  Sturm  sequence  on  [a,  b] 
The  result  described  above  is  very  useful  in  locating  the  zeros 


of  a  polynomial.      Now  we   discuss  how  it   can  be   applied  to   find  the   eigen- 
values , 

Let   A  be  a  symmetric  tri diagonal  matrix  of  order  n  with   d.    and 

e     as   diagonal   and  subdiagonal  elements   respectively, 
i 


A   = 


dl        \ 


61        d2        62 


e2        d3        e3 


en_2        dn-l        en-l 


e     .        d 
n-1  n 


(2.1) 


recalling  that  the  characteristic  polynomial  is  given  by 


p  (A)  =  det  (XI  -  A) 

n 


and  that  its  principal  minors  are  given  by  the  following  recurrence  relation 


P  (A)  =  1, 


Pl(A)  =  A  -  d1, 


p.(A)  -  (A  -  d.)  p±_±    (A)  -  e^  p._2  (A) 


(2.2) 


i  =  2,  3,  .  . .  ,  n. 


If  any  e.    =  0,   then  the   determination  of  the   eigenvalues   of  A  is   reduced  to 
that   of  determining  the  eigenvalues   of  two    smaller  matrices.      Hence  we 


assume   e.    ^  0   for  i  =  1,    ....   n  -  1.      We  then  have 
i 


THEOREM  II  [10,  p.  l68].   Let  the  real  symmetric  tridiagonal  matrix  A  be 

defined  by  (2.1),  with  all  e.  £   0.   Then  the  zeros  of  each  -n  (A), 

i  *i 

i  =  2,  3,  ...,  n,  are  distinct  and  are  separated  by  the  zeros  of  p.  ,  (A); 
and,  if  p  (A)  /  0,  the  number  of  eigenvalues  of  A  that  are  larger  than  A 
is  equal  to  the  number  of  sign  variations  in  the  sequence 

PnU),  Vi(A)'  •"»  P1(A)'  1* 

Since  the  eigenvalues  of  a  matrix  are  but  the  zeros  of  the 

characteristic  polynomial  of  that  matrix,  and  the  principal  minors  of 

AI  -  A  form  a  Sturm  sequence,  then  the  theorems  described  in  this  section 

can  be  used  as  the  basis  of  an  algorithm  for  computing  the  eigenvalues  of  A. 

(2)   The  Bisection  Method 

Suppose  the  eigenvalues  of  A  are  ordered  so  that 

A   >  L  >  ...  >  A  ,  if  ve  have  two  real  values  a  and  b   such  that 
1    2         n>  o      o 

bQ  >  aQ,      V(aQ)  <  k,       V(bQ)  >_  k,  (2.3) 

where  V(x)  is  the  number  of  sign  variations  in  the  sequence 

{p  (x),  p  (x),  ...,  p  (x)},  then  from  the  results  in  the  last  section  we  know 

that  there  exists  at  least  an  eigenvalue  A.  in  the  interval  (a  ,  b  ).   The 

k  o   o 

value  of  A   can  be  determined  as  accurately  as  possible  by  an  iterative  process 

called  the  "Bisection  Method."  The  algorithm  is  as  follows: 

Start  from  the  interval  (a  ,  b  ),  suppose  tnat  in  (i  -  l)  steps 

we  have  established  an  interval  (a.  n.  b.  n  )  such  that 

i-l'      l-l 

V(a.    .)    <  k,        V(b.    -  )  >  k,        b.    _    -  a.   _    =   (b     -  a   )/21_1        (2.M 
i-l  i-l     —  i-l  i-l  o  o 

In  the   i-th   step  we   compute  the  mid-point   c.    of   (a.       ,   b.       ),    i.e., 

ci  4  (vi +  Vi'-  (2-5) 


and  evaluate  the  sequence 

P0(c.),    P1(ci),    ...,   Pn(c.)  (2.6) 

to   determine  V(c).      Then> 

if  V(c. )   >  k,   we  take  b.   =  c. ,    a.   »  a.    n :  (2.7) 

i     —  ill  i-I 

if  V(c.)  <  k,  we  take  a.  =  c,  b.  =  b    .  (2.8) 

In  either  case,  we  have 

(bi  "V  =  I  (bi-l  -  Vl'  (2-3' 

and 

V(b.)  >_k;   V(a.)  <  k  (2.10) 

so  that   A     is   always   in  the  interval    (a.,  b. )   and  we   can  locate   it   in  an 

■K-  11 

interval  of  width  (b  -  a  )/2   in  p  steps  of  the  bisection  process. 

o    o 

With  exact  computation,  this  method  can  be  used  to  determine  any 
eigenvalue  to  a  prescribed  accuracy  without  referring  to  other  eigenvalues. 
The  process  converges  with  an ' asymptotic  convergence  factor  of  1/2  [10,  p.  128] 
It  is  clearly  not  a  rapid  convergence  but  the  algorithm  is  quite  effective. 

The  major  part  of  the  process  is  the  calculation  of  the  sequence 
(2.6)  using  formula  (2.2).   There  are  roughly  2n  multiplications  and  2n 

additions  for  each  evaluation  of  the  sequence.   If  all  the  n  eigenvalues  are 

2 
determined  in  t  bisection  steps,  then  essentially  2n  t  multiplications  are 

2 
required  (n  is  large).   Remember  that  e.  ,  i  =  1,  2,  . . . ,  n  -  2,  are  computed 

once  and  for  all,  this  contributes  (n  -  l)  multiplications  to  the  total 

number  of  operations. 

The  situation  is  slightly  complicated  when  there  are  a  number  of 


6 

very  close  eigenvalues,  since  p  (A)  =  g   (A  -  A.)  is  very  small  for  any  A 
which  is  in  the  neighborhood  of  these  eigenvalues,  the  zeros  of  each  p.(A) 


i 
separate  those  of  p   (A)  and  accordingly  many  of  the  p.(A)  may  also  become 

very  small  and  give  rise  to  underflow.   Also,  if  some  of  the  eigenvalues 

are  very  large,  then  overflow  may  cause  troubles  during  the  evaluation  of 

the  Sturm  sequence. 

This  difficulty  may  be  avoided  by  a  simple  modification  [6],   The 

sequence  of  p. (A)  is  replaced  by  a  sequence  of  q.(A)  defined  by 

q1(A)  =  Pi(A)/p._i(A),    i  =  1,  2,  ...,  n  .  (2.1l) 

V(A)    is  now  given  by  the  number  of  negative  q.(x).      The   q.(A)    satisfies   the 
relations 

qx(X)    =    A   -   d1, 

q.(X)   =    (X   -   d.)    -   e.2./a.    .     (X)        i   =  2,    ...,   n    .  (2.12) 

1  1  1—  J.       1—1 

In  case  q.   (X)  is  zero  for  some  i,  we  just  replace  the  zero  q.   (x)  by  a 
suitable  small  number  and  the  error  analysis  for  the  p.(X)  sequence  applies 
almost  unaltered  to  the  q.(X)  sequence  [6]. 

Comparing  the  computation  of  the  sequences  q^X)  and  p^X),  we 
find  that  two  multiplications  have  been  replaced  by  one  division.   Further- 
more, we  only  have  to  detect  the  sign  of  q.(X)  instead  of  those  of  p.  n(X) 
and  p.(X).   For  serial  computers  in  which  the  execution  of  a  multiplication 

or  a  division  uses  almost  the  same  time,  the  replacement  of  p.(X)  by  q.(x) 
causes  an  improvement  in  the  execution  time.   But,  for  the  parallel  com- 
puter ILLIAC  IV,  the  execution  time  for  a  division  is  about  six  times  that 
of  a  multiplication,  so  usually  the  p. (A)  sequence  is  evaluated  except 
for  some  special  cases  which  will  be  discussed  later. 


The  major  process   in  the  bisection  method  is  the   calculation  of 
the  Sturm  sequence  p. (x)    (or  q.  (X)).      Once  the   eigenvalues  are   separated, 
we  may  use  any  algorithm  for  the  calculation  of  a   zero  of  a  real   function 
to   find  the  eigenvalue  within  each   interval.      The  bisection  method  can  still 
be  used  to  calculate  the  eigenvalues   up  to  some  specified  accuracy  after  we 
separate  all   of  them.      But,    it   converges   only  linearly.      Hence  some  other 
algorithms  with  higher  convergence  rate  must  be  considered.      In  this  paper 
we  select   the  Zero-in  algorithm  [ll]  which   is   due  to  Wijngaarden  et.    al.    to 
calculate  the  eigenvalues  once  we  separate  them.      This   algorith,   a  method 
of  order    (/5~+  l)/2    [10,   pp.    100-101  ]    is   described  in  APPENDIX  I. 


2.      IMPLEMENTATION 

The  multisection  method  contains   two  major  stages.      In  the   first 
stage  we   calculate  the  Sturm  sequence  p. (A)    (or  q.(A))   and  find  all  the 
intervals  each  of  which   contains   only  one   eigenvalue.      In  the   second  stage 
we   use  the   zero-in  algorithm  to   simultaneously  determine   6h   eigenvalues 
at   a  time.      We  shall   consider  the   implementation  of  this  method  on  a  N  =  l6  PE 
machine   for  a  symmetric   tridiagonal  matrix  of  order  n  =  N  =  16. 

Figure  1   shows  the  storage  allocation   scheme   for  the   first   stage. 
We  store  the   diagonal   and  sub diagonal   elements  of  A  across   the  PEs .      This 
kind  of  storage  allocation   is   very   convenient    for  the  parallel   operation 
which  will  he   discussed  later.      The  quantities    e.    (i  =  0,   1,    ...,   n-1 )  will 
be  used  frequently  when  calculating  the  Sturm  sequence,    so  we   compute  them 
and  store  them  into  row  P  for  later  use. 

If  all  the  eigenvalues    are   required,   we  use    (-     A        ,        A        ) 
as   the   initial    interval,   where  A  =  max  g. .      The   g. 's    can  be  obtained 

'  '  '   '  '  oo  "j.  X 

i 
and  stored  into  row  A  by  one  statement,  that  is, 

row  A  =  row  D  +  row  E  +  (row  E  route  right  one  PE) .         (2.13) 

Then,  one  more  instruction  is  required  to  pick  the  maximum  element  in 

i  i  ii  2 

row  A  as  the  value  of  | |A| |  .   All  the  quantities  e.  are  also  computed  and 

stored  into  row  P  by  one  instruction, 

row  P  =  row  E  x  row  E.  (2.1*0 

If  the  dimension  of  the  matrix  is  not  greater  than  N,  we  then  always  have 
this  simple  situation. 
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The  major  process   in  the   first   stage   is   to   calculate  V(y    ). 

i 

The  storage  allocation  scheme  for  this  process  is  shown  in  Fig.  2.   For  the 

given  interval  (a,  b),  assume  that  the  value  V(h)  is  known,  we  divide  (a,  t>) 

into  N  =  16  sub intervals.   Let  the  left  end  point  of  the  i-th  interval  be 

y. ,  we  then  have 
1 

y.  =  a  +  (i  -  1)  5-Jp.  m  (2>15) 

All  the  y. 's  are  stored  into  row  T  by  using 


1 


h  =  (b  -  a)/N  (2.i6) 


and 


row  T  =  a  +  PEN  x  h,  (2.17) 

where  PEN  is  an  integer  whose  value  is  j  in  PE  .   Usually,  (2.2)  is 

J 

used  to  evaluate  the  sequence  p.  (A)  because  it  requires  less  computation  time, 
on  the  ILLIAC  IV.   But  if  there   is  any  subdiagonal  element  which  is  relatively 
small  with  respect  to  the  rest  of  the  elements  of  A,  then  (2.12)  is  used. 
The  process  is  itself  sequential,  but  all  N  V(y.)'s  can  be  calculated  simul- 
taneously since  each  PE  contains  one  y.  and  all  the  PEs  are  kept  busy.   Once 
the  V(y.)'s  are  computed,  they  are  stored  into  row  V  (see  Fig.  2). 

The  number  of  eigenvalues  contained  in  the  intervals 
(y1$  y2),  ...,  (y-jt.  l^g)  can  be  found  by 

row  M  =  (row  V  shift  left  one  PE)  -  row  V.  (2.18) 

V(b)  -  V(yw)  gives  the  number  of  eigenvalues  contained  in  the  Nth  interval 
(y  ,  b)  where  V(b)  is  known. 
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ROW   ELf   AND   EU.:       LOWER  AND   UPPER   BOUNDS  OF  THE   SUBINTERVAL  WHICH   CONTAINS   EXACTLY 

ONE  EIGENVALUE 
ROW   Et   AND   Ui:       LOWER  AND   UPPER   BOUNDS   OF  THE   SUBINTERVAL  WHICH    CONTAINS    MORE   THAN 
ONE   EIGENVALUE 


FIGURE  2.   STORAGE  ALLOCATION  SCHEME  AT  THE  END  OF  THE  FIRST  STAGE 
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Once  row  M  is  obtained,  its  contents  are  examined.   For  these 
PEs  with  M  equal  to  0,  nothing  is  done.   For  those  which  have  M  equal  to 
1,  the  y.'s  are  stored  into  the  proper  position  in  row  EL.  (the  value  of  j 


may 


be  different  in  different  PE's)  as  the  lower  bounds  of  the  intervals 


that  contain  only  one  eigenvalue.   The  values  of  y.  +  h  are  stored  into  the 

corresponding  position  in  row  EU .  as  the  upper  bounds  (see  Fig.  2).   The 

J 

subintervals  which   contain  more  than   one   eigenvalue  are  treated  similarly 
where   y.    and  y.    +  h  are   stored  in  row  L.    and  U,    respectively. 

The  process   described  above   can  be  performed  simultaneously   for 
each  group  of  PE's  with  the  same  value  of  V..      The  shaded  area  in  the   figure 
means   that   some  y.   or  y.    +  h  are   stored  there. 

We   repeat  the  above  mentioned  process   for  each   subinterval  which 
contains  more  than  one   eigenvalue    (i.e.,    the   subintervals    stored  in  the 
area  between  row  L     and  row  U     in  Fig.    2)   until   no  more   subintervals   con- 
tain  more  than  one   eigenvalue.      This   can  be  achieved,    at  least   in  principle, 
since   all  the  eigenvalues   are   distinct. 

The  storage   allocation  scheme   for  the   implementation  of  the 

second  stage,    namely  the   Zero-in  algorithm   (App.    I),    is   shown  in  Fig.    3. 

Since  the   diagonal   and  subdiagonal   elements   of  the  matrix  A  are  referred 

to  very  often   in   each  PE  during  the  process,   they  are   duplicated  into   each 

PE  at  the  beginning  of  this   stage    (from  rwo  D     to  row  E        ).      This    is 

feasible   since  we  assume  that   the  order  of  the  matrix  is   equal   to  the 

number  of  PE's   and  6U<<20U8  which   is   the  number  of  rows   in  the  PE  memory. 

The  lower  bounds   and  upper  bounds   of  the   subintervals  that   contain  only 

one   eigenvalue   are   stored  in  rows   B  and  C   respectively  as   initial   contents. 

The  b.    and  c.    are   interchanged  if  necessary  to   satisfy  the   condition 
ii 
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PE 

0 

1 

2 

3 

4 

5 

6 

7 

8 

9 

10 

11 

12 

13 

14 

15 

*  ROW  A 

•l 

a2 

a3 

•a 

a5 

a6 

a7 

a8 

a9 

aio 

all 

a12 

a13 

a14 

a15 

a16 

*  ROW  B 

bl 

b2 

b3 

b4 

b5 

b6 

b7 

b8 

b9 

b10 

bll 

b12 

b13 

bu 

bl? 

b16 

*  ROW  C 

Cl 

c2 

c3 

c4 

c5 

c6 

c7 

c8 

C9 

C10 

cll 

C12 

ci3 

c14 

C15 

C16 

*  ROW  D. 

dl 

dl 

dl 

dl 

dl 

dl 

dl 

di 

dl 

dl 

dl 

dl 

dl 

di 

dl 

dl 

*  ROW  D2 

d2 

d2 

d2 

d2 

d2 

d2 

d2 

d2 

d2 

d2 

d2 

d2 

d2 

d2 

d2 

d2 

• 

• 

• 

• 

• 

• 

• 

• 

• 

• 

• 

• 

• 

• 

• 

• 

• 

ROW  D 
n 

d 
n 

d 
n 

d 
n 

d 
n 

d 
n 

d 
n 

d 
n 

d 
n 

d 
n 

d 
n 

d 

n 

d 

n 

d 
n 

d 

n 

d 
n 

d 
n 

ROW  E 

el 

el 

el 

Cl 

Cl 

el 

Cl 

el 

el 

el 

el 

61 

61 

Cl 

Cl 

el 

ROW  E 

e2 

e2 

e2 

e2 

e2 

e2 

e2 

e2 

e2 

e2 

e2 

e2 

e2 

e2 

e2 

e2 

• 

• 

• 

• 

• 

• 

• 

• 

• 

• 

• 

• 

• 

• 

• 

• 

• 

ROW  E   , 
n-1 

Vi 

Vl 

Vl 

Vi 

Vl 

en-l 

Vi 

en-l 

Vi 

en-l 

en-l 

Vi 

en-l 

Vi 

Cn-] 

Vi 

*  ROW  PA 

pal 

pa2 

pa  3 

pa4 

pa5 

pa6 

pa7 

pag 

Pa9 

pa10 

Pail 

pa12 

Pal: 

P*14 

Pal' 

pa16 

*  ROW  PB 

pbl 

Pb2 

Pb3 

Pb4 

Pb5 

Pb6 

pb7 

pb8 

Pb? 

pbio 

pbu 

pb12 

pbl; 

Pb14 

Pbl' 

Pblf, 

*  ROW  PC 

pcl 

pc2 

pc3 

pc4 

pc5 

PC6 

pc7 

PCg 

pc9 

pc10 

PC11 

pc12 

PC13 

pc14 

PC1( 

PC16 

*  ROW  INT 

pl 

p2 

p3 

P4 

p5 

P6 

p7 

P8 

P9 

p10 

Pll 

p12 

P13 

P14 

p15 

P16 

*  ROW  MID 

ml 

m2 

m3 

mA 

m5 

m6 

m7 

m8 

m9 

mio 

mu 

m12 

m13 

mi4 

m15 

mi6 

(s), 


ROW   PA:         VALUES  OF  p    (a    v    ' ) ,    DENOTED  BY    pa    . 


(s). 


ROW  PB:    VALUES  OF  p  (b     ),  DENOTED  BY  pb 


ROW  PC:    VALUES  OF  p  (c  (s)),  DENOTED  BY  pc,. 
n  1  r  i 

ROW  INT:   VALUES  OF  LINEAR  INTERPOLATION  BETWEEN  a  (s)  AND  b  (s) . 
ROW  MID:   VALUES  OF  MIDPOINT  OF  INTERVALS  (b  (s),  c  (s)). 

*  ALL  THE  ENTRIES  SHOWN  IN  THESE  ROWS  ARE  THE  VALUES  AT  sTH  ITERATION,  THE 
SUPERSCRIPT  "s"  IS  OMITTED. 


FIGURE  3.   STORAGE  ALLOCATION  SCHEME  FOR  THE  SECOND 
STAGE— ZERO- IN  ALGORITHM 


Ik 

|p  (b.)|  <  |p  (c.)|.   The  entries  a.'s  in  row  A  are  then  chosen  to  be  the 

'*n     1    '   —    '    n     1    '  i 

same   as    c.      The  values  of  p    (a.),    p    (b.)   and  p   (c.)   are   computed  simul- 
i  n     i  n      l  n     l 

taneously   for   i  =  1,    .  ..,    N  and  stored  into   rows   PA,    PB  and  PC.      The   in- 
terpolations  p.    of  a.    and  b.    are   stored  in  row  INT.      The  midpoints  m.    of 
ill  r  i 

b.    and  c.    are   stored  in  row  MID.      The  new  values  of  b.    in  row  B  are  taken 
ii  l 

to  be  p.    or  m.    depending  on  whether  p.    is  between  m.    and  b..      This   can  be 
ii  l  ii 

done   simply  by  using  an   IF   statement.      The  new  contents   of  row  A  and  row  C 
are  then   changed  as   described  in  APPENDIX   I.      We  then   start  the  next   iter- 
ation.     When  the   stopping  criteria  are   satisfied  in  some   PEs,    these  PEs 
are   disabled.      But   for  those  PEs  which   are   still  enabled,    the  process   con- 
tinues  until  the   stopping  criteria  are   satisfied  in  each   PE. 

If  we  have  one   interval   that    contains   some  very  close  eigenvalues 
then  the  number  of  sturm  sequence   evaluations  may  be  large,    this   coupled 
with  the  time  required  for  storing  and  fetching  various   subintervals   for 
the  PE  memory   can  prove  to  be   a  time   consuming  process. 
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3.      TIME  ESTIMATES 

The  time  required  to   find  all  the  eigenvalues   for  various  matrices 
with   different    dimensions    is   discussed  in  this   section.      We  will   discuss 
first  the  simple  case  n  <_  N,  where  n   is  the  dimension  of  the  given  matrix 
and  N  is   the  number  of  PEs   in  the  parallel   computer. 

Two   classes   of  operations  will  he  discussed,    data  manipulations 
and  arithmetic   operations.      The   clock  time  taken  "by   each  of  the   four  basic 
arithmetic   operations    in  the   ILLIAC   IV   is    shown  in  TABLE  I,   part  of 
TABLE  6-2   in    [15].      We  assume  that  we  use  normalized  floating-point  numbers 
rounded  in  6i+-bit  mode. 

TABLE  I.      Clock  Time   for  Each  Arithmetic   Operation  on   ILLIAC   IV 


Operations 

Clock  Time* 

+    (ADRN ) 

(SBRN) 

x     (MLRN) 

r     (DVRN) 

6 

7 

9 

56 

-9 
*  1   clock  time  #  60   x  10        seconds 

In  what   follows  we  will   assume  that  one   addition  or  subtraction   is 
equivalent  to   one  multiplication,    and  a  division   is    equivalent  to   6  multi- 
plications. 


(l)      Computation   Time    (CT) 

The   computation  process   in  the  multisection  method  can  be  divided 
into  four  major  parts: 
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(a)  calculate     ll-A-ll^j 

(b  )  calculate  the  square  of  each   sub  diagonal   element, 

(c)  evaluate  the  Sturm  sequence   (p.(A)}  or   {q.(A)},    and 

(d)  execute  the   zero-in  algorithm. 

The  norm    |  |a| | m  is   used   in  deciding  upon  the  stopping  criterion. 
Besides,    if  all   the   eigenvalues   are  required,   we   can  use    (-||a|      ,    ||a|       ) 

'  ^'  x  *  I     I  I     I    oo*  II  II    oo 

as   the   initial   interval.      To    find  the  norm,    (2.13)    is   used.      This   part 
requires   2   additions   and  is   executed  only  once. 

For  part    (b),   we  use    (2.lU).      The  square  of  e.    (i   =  1,    ...,   n  -  l) 
is    calculated  once   and  for   all,    so  only  one  multiplication   is   required. 

For  the  evaluation  of  the  Sturm  sequence  we   assume  that    (2.2) 
is   used,   this   requires   2n  multiplications    and  2n   additions.      Besides,   to 
find  the  number  of  eigenvalues    contained  in   each   interval   requires   1 
subtraction. 

The   zero-in  algorithm  is   an   iterative  process.      Many   iterations 
may  he   required  before  we  obtain  an   eigenvalue.      During  each   iteration, 
(2.2)    is   used  to   evaluate  the  value  of  the  characteristic   polynomial. 
Then, 

INT.    =   [b.    pn    (a.)   -   a.    Pn    (*>.)]/[Pn    (aj_)    -  Pn    (b.)]  (2.19) 

is   used  for   interpolation,   hence  2(n  +  l)   multiplications,    2(n   +  l)    sub- 
tractions  and  1   division   are  required. 

Suppose  p  is  the  number  of  times  we   repeat  part    (c)   to   separate 
all  the   eigenvalues    and  q  is   the  number  of  required  iterations    in  the  Zero- 
in   algorithm.      Then  the  total  number  of  arithmetic  operations   used  to  obtain 


IT 
all  the  eigenvalues    is    approximately 

6q  +   Un    (p  +   q)  (2.20) 

equivalent  multiplications,  and  the  total  time,  in  microseconds,  for  the 
arithmetic  operations  is  roughly 

TA  =  ^  q  +  2n  (p  +  q).  (2.2l) 

If  we  assume  that  p  and  q  remain  constant  as  n  varies,  then  (2.21 ) 
represents  a  linear  relation  between  the  total  CT  and  the  dimension  of  the 
matrix,  see  Fig.  h. 

While  for  n  >  N  where  we  assume  here  that  n  <_  1021+ ,  part  (a) 
requires  2«[— ]  additions,  and  part  (b )  used  [— ]  multiplications,  where  [^-] 
is  the  smallest  integer  no  less  than  — .   The  number  of  the  required  opera- 
tions in  parts  (c)  and  (d)  remains  the  same.   Thus,  from  (2.21),  the  total 
CT,  in  microseconds,  for  the  general  case  is  approximately 

*A  =  I   [N]  +  2  q  +  2n  (p  +  q)>  (2*22) 

see  Fig.  5. 

(2)      Data  Manipulation   Time    (DMT) 

Most   of  the   data  manipulation  occurs    during  the   following  processes 

(a)  Dividing  the   given   interval   into   N  subintervals   and  distributing 
the  left  terminals   into   all   PEs. 

(b )  To   evaluate   the   Sturm  sequence   in  all   PEs   simultaneously,  we  have 

2 
to   grab   d.    and  e.    one  by  one   from  the  memory  because  we   store  them  across 

all   PEs.      This  process   takes   2n  times  of  the   function:      GRABONE. 
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2q 


1    Unit: 

ys 

> 

H 
O 

|q  +  2N(p+q) 

rt 

M 

n 

H 

2(p+q) 

^"" 

n:      Dim.    of  Matrix 

1 

■            i 

RELATION:       TA  =  |-  q   +   2(p  +   q  )n 

p:      #   of  times  of  Sturm  sequence  evaluations 

q:       #   of  required   iterations    in   Zero-in   algorithm 

N:      #  of  PEs    in  parallel   computer 


FIGURE   U.       TOTAL  COMPUTATION  TIME  FOR  n   <   N 


19 


jr 

W 

4-1 

s 

2: 

•H 

H 

CN 

Cfl 

V4 

o 

EH 

d 

oc 

K 

o 

rH 

O 

•H 

CO 

H 

4-1 

Eh 

CO 

C 

< 

3 

■H 

Eh 

c 

■— l 

1 

5P 

,^N 

Cfl 

O 

K 

cr 

> 

u 

s 

QJ 

QJ 

o 

+ 

qj 

N 

l-i 

QJ 

o 

a. 

CJ 

c 

4J 

1-3 

,n^' 

c 

•H 

3 

< 

CM 

QJ 

(X 

Eh 

a 

-a 

e 

o 

+ 

cr 

eu 

o 

Eh 

01 

^ 

CJ 

r-|e\i 

U) 

3 

r— 1 

B 

O" 

QJ 

LP\ 

+ 

>-i 

QJ 

iH 

3 

k-l 

— i 

W 

c~|z 

4-1 

ca 

cfl 

g 

z      — 

c 

CO 

o 

m|<r 

14-1 

o 

a 

(-H 

o 

•H 

(in 

ii 

4-1 

c 

w 

CO 

•H 

< 

qj 

u 

4-1 

e 

QJ 

C/J 

•H 

4-1 

w 

4J 

•r-t 

p- 

Z 

14-1 

14-1 

14-1 

O 

o 

o 

O 

i— I 

H 

=tt= 

=%: 

=%; 

3 

w 

aj 

D- 

cr 

Z 

20 

(c)  As  mentioned  above,   the  Sturm  sequence  may  "be  evaluated  more 
than  once  to   separate  all  the   eigenvalues.      Except    for  the   first    evaluation, 
we  have  to  pick  up  the   interval  which   contains  more  than  one  eigenvalue   from 
the  PE  memory   shown   in  Fig.    2.      Since  the  intervals   are   stored  at   random, 
i.e.,   not   every  PE  contains   such  kind  of  interval,  we  must   test   all  the 

PEs   and  enable  those  PEs  which   contain  such   intervals,   then  pick  up  the 
interval   contained  in  the  PE  with  maximum  PEN.      This   requires   one  execution 
of  the  MAX   function. 

(d)  Before  executing  the   zero-in  algorithm,  we  must   distribute  the 
diagonal   elements   and  the  squares  of  the   subdiagonal   elements   into   all   PEs. 
This    requires   2n   executions  of  GRABONE  function. 

Let  p  and  q  have  the  same  meaning  as  before,  then  the  data  mani- 
pulation requires  roughly  p  calls  of  the  function  MAX  and  2np  calls  of  the 
function  GRABONE. 

The   execution  time   for  the   functions  MAX  and  GRABONE  can  be 
calculated  according  to  GLYPNIR  PROGRAMMING  MANUAL   and  TABLES  G-l ,    G-2   in 
[15].      The   function  MAX  requires    62  clocks  while  the   function  GRABONE 
requires  15   clocks.      Thus  the  time,    in  microseconds,    required  for  data 
manipulation  is    approximately 

TD  =  tD  =   2pn.  (2.23) 

By   adding  the  t      and  t    ,  we  obtain  the  total    execution  time, 


tT=f [|]   +|q+   2n    (q  +   2p).  (2.2* 

Fig.    6   shows  the  total   execution  time  for  the  multisection  method  to   find 
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all  the  eigenvalues  with  respect  to  the  dimension  of  the  given  matrix  for 

some  sets  of  values  p  and  q. 

Let  n  he  the  dimension  of  matrix  A,  (-AM  ,  I  |a|   )  he  the 

»         II     II  oo>     II     I  I  00 

initial  interval  which  contains  all  the  eigenvalues  of  A,  and  t  he  the 

B 

number  of  bisections  on  a  serial  machine  to  obtain  an  eigenvalue,  then 

2     I  A  !   <  e.  (2.25) 

■  ■    I  ■  00   — 

Using  the  multisection  method  on  a  parallel  machine  to  separate 

the  same  eigenvalue  we  require,  say,  t   steps,  then 

t 
2|  | A I   /N   <  e.  (2.26) 

11     I  I  00  — 

Solving  for  t      and  t      from   (2.25)   and    (2.26),  we   find  that  the  ratio 
tjt..  is   roughly   equal  to   £n  N/iln  2,    i.e.,    £og.N.      For  ILLIAC   IV,    N  =   61+, 
the   ration  t   /t      is   6.      This    is  not  much   of  an   improvement.      However,   when 
we  use  the  multisection  method  to   separate  all  the   eigenvalues  we  then  use 
either  the   Zero-in  algorithm  or  the  bisection  method  to   compute  6k   eigen- 
values  in  those   intervals   simultaneously.      Hence  this  method  is   at   most 
6'[n/6h]   times   as   fast   as   the   serial   computer  to   find  all  the   eigenvalues. 
For  a  matrix  with   dimension  n  =   U096,   the  multisection  method  is   at  most 
28U  times   faster. 
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h.      ERROR  ANALYSIS 

The  error  bounds   for  the  eigenvalues   of  a  symmetric  tri diagonal 
matrix  calculated  by  the  bisection  method  are  6x2        | |a|      ,    [7,    12],    for 
any  reasonable  round-off  procedure   in   floating-point  binary  arithmetic   if 
a  t-digit  machine   is   used.      Note  that  the  bound  is   independent  of  n,    the 
dimension  of  the  matrix. 

In  the   first   stage  of  the  multisection  method  all  the  eigen- 
values  are   separated  by  the  Sturm  sequence  process  which   is   substantially 
the   same  as   the  bisection  method  except  that  the   interval  which   contains 
the   eigenvalues   is   divided  by  N  instead  of  2.      Thus  the  numerical   accu- 
racy in  this   stage  should  not  be  worse  than  that    in  the  bisection  method. 
In   fact,    very  few   iterations   of  the  Sturm  sequence  processes   are   required 
to   separate  all  the   eigenvalues. 

The   Zero-in  algorithm  can  be  thought   of  as   a  bisection  method 
combined  with  linear   interpolation.      Usually,   the   interpolated  point   is 
used  as   a  new  end  point   of  the  new   interval.      But   if   it    is   outside  the 
old  interval,    the  midpoint   of  this    interval    is   used  as  a  new  end  point, 
this   is   in  fact   a  bisection  process.      Hence  we  have  the   same   error  bounds 
for  the  computed  eigenvalues,    6   x   2     '   x    | |a| |    .      We   see  that  the  relative 
error  may  be  quite  large   for   small   eigenvalues. 


2k 


III.   RESULTS  AND  DISCUSSION 

Consider  the  matrix  B, 

"  x    1 

1   -x   1 

lxl 

1   -x   1 


I 


1   x 


1   -x 


(3.1) 


nxn, 


where  x  is  real, 


1.      EIGENVALUES 

The   eigenvalues   are   computed  by  both   the  QL  algorithm  [3,   tq£l] 
and  the  multisection  method.      TABLE  II  gives    some  of  the   results   computed 
on   ILLIAC   IV  simulator    (B67OO).      The  tolerance   e   in  both   methods   is   set 
to  be  10~15. 

The   exact    eigenvalues  of  matrix   (3.1 )    can  be   computed  by  the 
following  formulae; 


r    2        ,  2  ■ ,     k ^1/2 

xk=  [x    +  k  cos    (^ry^ 


An+l-k  "   "   V 


k  =  1,    ...,    [n/2], 


A     n    =   0      if     n   is   odd. 
n+1 


(3.2) 
(3.3) 
(3.U) 
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TABLE  III  gives  the  values  of  X.  of  some  matrices  for  the  comparison 
with  TABLE  II.   The  A.'s  are  computed  by  (3.2)  (3.1+ )  on  the  IBM  360/75 
in  long  precision. 


TABLE  II.   Eigenvalues  Computed  on  ILLIAC  IV  Simulator 


For  x  =  1.0,  N  =  h,    the  eigenvalues: 


Multisection 

-1.902113032590307 

•1.17557050U58W7 

I.17557050U58U9U7 

1.902113032590300 


QL 


-1.902113032590^00  (0)** 

-1.17557050^585025  (1) 

1.17557050U58U961  (h) 

1.902113032590350  (2) 


For  x  =  1.0,  n  =  8,  the  eigenvalues: 


Multisection 

-2.128870331006081| 

-I.829561793253738 

-I.U1U213562363092 

-1.058590930637600 

1.058590930637600 

1.U1U213562373092 

1.829561793253738 

2.128870331006098 


QL 


-2.128870331006297  (0) 

-I.829561793253966  (l) 

-1.1*11+21356237333^  (2) 

-1.058590930637820  (h) 

1.058590930637628  (1+) 

l.UlU213562373106  (2) 

1.829561793253809  (3) 

2.128870331006169  (2) 


(Continued  on  next  page) 


*  The  underline   shows   the   digits  which   do   not  match  with   the  values   shown 
in  Table   III. 

**  The  numbers   in  parentheses   indicate  the  number  of  iterations  required 
to  obtain  that   eigenvalue  by  TQL1. 


26 


TABLE  II    (continued) 


For  x  =  10      ,    N  =  12,   the  eigenvalues 

Multisection 
-1.91+i883631+8T78Ut 
-1. 770912051 33U6U8 
-I.U97021U963 75602 
-I.l36l291+93506317 
-0. 70920977^15 5 57^+7 
-0. 21+1073360718051+6 

0.2lH0733607l80522 

0. 709209  77^15557^ 

1. 136129^9350632^ 

l.U97021^96375602_ 

I.7709l205l331+655p 

1.9 1+18836 3I+8778I+7 


QL 


-1.91+1883631+878216  (0) 

-1.770912051335017  (1) 

-I.U97021U96375986  (2) 

-1.136129^93506708  (2) 

-0.70920977^1559833      (2) 

-0.2U10733607183300      (2) 

0.2U10733607178379      (2) 

0.70920977^1555^99   (3) 

1.136l291+9350632j+       (h) 

1.1+97021 1+96375638       (3) 

1.  770912051 331+69p_       (3) 

1.9^188363^877868       (2) 


For   x  =  10      ,    N  =   2l+,    the   eigenvalues: 

Multisection 
-1.9  81+229^026  5^165 

-I.937166322283062 

-I.859552971803382 

-I.75261336OH6257 

-I.618033988780788 

-1. 1+5793725^877116 

-I.27I+8I+7979536602 


QL 


-1.981+2291+02651+982  (0) 

-I.9371663222838I+U  (l) 

-1.8595529  71 8Q1+177  (2) 

-I.75261336OH7OIO  (2) 

-1.6l803398878l669  (2) 

-I.I+5793725I+87805I+  (2) 

-1. 27^81+7979537^55  (2) 


(Continued  on  next   page) 
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TABLE  II    (continued) 

Multisection 
-I.O7165359OOOU651 
-0.8515585831888650 
-0.6180339888307955. 

-0. 37^762629301+8665 

-0.125581039^567707 
0.l2558l0391+56776ci 
0.37^762629301+8683 
0.6180339888307955 
0.851558583l88865£ 
1.07l653590OOi+651 
1.2 Jk 8k 79 79 5 36602 
1.1+57937251+87710£ 
1.618033988780795. 
I.752613360116257 
1.859552971803389 
1.937166322283083. 
1.981+2291+0265i+l1+)+  ' 


QL 


-1.071653590005390  (2 

-0. 8515585831 89561+9  (2 

-0.6l  8033988831 1+172  (2 

-0.371+762629305J+385  (2 

-0.12 5 5 810 39^ 572610  (2 

0.12558l0391+565282  (2 

0.371+ 762629  301+6693  (2 

0.61803398883061+99  (2 


0.8515585831888370 

(3 

1.071653590001+658 

(5 

1.271+81+7979536652 

(3 

l.I+5793725l+877l80 

(3 

1.618033988780873 

(3 

I.7526l3360ll6300 

(3 

1.859  5  52971 8031+21+ 

(3 

1.937166322283091 

(3 

I.98I+229I+0265I+158 

(1 
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TABLE  III.   Eigenvalues  Computed  by  (3.2)  -  (3.1*)  on  the  IBM  360/75  in 
Long  Precision 


For  x  =  1.0,  N  =  k: 

±1.90211   30325  9030? 
±1.17557  050^5  8U9U6 

For  x  =  1.0,    N  =  8; 

±2.12887  03310  0608I+ 

±1.82956  17932  5371+5 

±l.l+ll+21   35623  73095 

±1.05859  09306  37601 


For  x  =  10*"^,    N  =  12: 


±1.9^188  363^8  77852 
±1.77091  20513  3^653 
±1.1+9702  1U963  75601 
±1.13612  9^935  06320 
±0.70920  977^1  55572 
±0.2^107  33607  18052 


For  x  =  10~5,    N  =  2k: 


±1.981+22  91+026  51+15^ 
±1.93716  63222  83073 
±1.85955  29718  03390 
±1.7526l  33601  16255 
±1.6l803  39887-80796 
±1.1+5793  7251+8  77118 
±1.271+81+  79795  36599 
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TABLE  III  (continued) 

±1.07165  35900  OU650 

±0.85155  85831  88861 

±0.61803  39888  30797 

±0.37^76  26293  OU867 

±0.12558  1039^  56776 
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2.   ACCURACY 

Comparing  the  eigenvalues  listed  in  the  last  section,  we  find 
that  the  eigenvalues  obtained  by  the  multisection  method  agree  to  lk 
decimal  digits  with  those  listed  in  TABLE  III,  a  better  agreement  than 
those  of  the  QL  algorithm  [3,  tq£l]. 

There  are  some  cases  where  the  multisection  method  may  get 
into  trouble.   Consider  the  matrix  of  order  n  such  that 


d  =  2,  i  =  1,  2,  ...,  n, 

(3.5) 
e.  =  1,  i  =  1,  2,  ...,  n  -  1. 

The   eigenvalues   of  this  matrix  are 

k   Sin      ^2(n+l)]*    k  =  1'    2'    •'•'   n*  (3,6) 

If  n  is  large  the  smallest  eigenvalues  are  much  smaller  than  unity.   In 
computing  the  Sturm  sequence  for  any  value  of  y  we  have 

P.(y)  =  (y  -  d.)  P.(y)  -  e^  P^Cp)  (3.1) 

and  the   factor  y   -   d.    is   y  -  2«      Suppose  we  were  working  in  15-digit 

-9 

decimal   floating-point   arithmetic,    and  y   is  of  order  10        in  magnitude, 

then  for  y  -  2,  the  last  9  digits  of  y  are  completely  lost.  In  this 
case  we  may  not  obtain  many  significant  figures  for  the  small  eigen- 
values . 

Consider  the   following  two   graded  matrices  with   diagonal   elements 

12 
varying  from  1   to  10        as   example.      Matrix  X  is   defined  by 

d   ,    d    ,    . . . ,    d       =  1,    2      ,    . . .  ,   12      , 

12  12  (3.8) 

e.   =  1   for  i  =  1,    2,    . . .,   11. 
i 
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Matrix  Y  is   defined  by 

d   ,    d   ,    ...,    d       =  12      ,11      ,    ...,  1, 

1        2  12  (3.9) 

e.    =  1   for  i   =  1,    2,    . . .,    11. 

Both  matrices  should  have  the  same  eigenvalues. 

The  eigenvalues  of  X  and  Y  are  computed  by  using  Kahan  and  Varah's 
recursection  method  [l*+]  on  IBM  360/75  with  long  precision.   They  are  listed 
in  TABLE  IV.   The  eigenvalues  of  X  computed  by  both  the  multisection  method 
and  the  QL  algorithm  are  shown  in  TABLE  V.   Those  of  matrix  Y  are  listed 
in  TABLE  VI. 

For  the  QL  algorithm,  the  eigenvalues  of  X  computed  by  this  method 
have  very  high  accuracy.   But  the  eigenvalues,  especially  those  with  small 
magnitude  ,  of  Y  have  larger  relative  errors.   However,  for  the  multisection 
method,  the  eigenvalues  of  X  and  Y  are  almost  the  same  and  are  very  close  to 
those  shown  in  TABLE  IV. 
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TABLE  IV.      Eigenvalues   Computed  "by  Recursection  Method 


Eigenvalues  of  X 


Eigenvalues   of  Y 


0.99902  2U838  11381  +  0 

0.102U0   00960  28228  +  k 

0.590J+9  00001   622TT  +   5 

0.10U85   76000  00097  1Q   +   7 

0.97656  25000  00035  +   7 

0.60^66  17600  00008  +  8 

O.282U7   52U90  00017  +  9 

0.10737  U182U   00008  +  10 

0.3^86.7   8UU01   00013  1Q  +  10 

0.10000  00000   00006  +11 

0.25937  U21+60  1001U  +  11 

0.61917  36U22   1+00^9  0  +  11 


0.99902  2U838  11U02  +  0 

0.1021+0   00960   28229  +  k 

0.590^9  00001   62275  +  5 

0.10U85   76000  00095  +  7 

0.97656   25000   000U8  +  7 

O.60U66  17600  00029  +  8 

0.282*+ 7  52^90  00005  1Q  +  9 

0.10737  1+182*+  00002  +  10 

0.3^867  8^1+01   00012  +  10 

0.10000  00000  00005  10  +  11 

0.25937   U2U6O  10017  1Q   +  11 

0.61917   361+22   1+0033  0   +  11 


TABLE  V.      Computed  Eigenvalues   of  Matrix  X 


By  Multisection 

0.99902  21+838  11327  10+  0 

0.102>+0   00960  28223_  1Q  +  h 

0.  590^9  00001   62229_  10  +   5 

0.10i+85   76000    0009p_  1Q  +   7 

0.97656  25000  00011  1Q  +   7 

O.60I+66  17600  00000_  +  8 

0.2821+7  52U90  00001  +  9 

-  10 

0.10737  U182U  00000  +  10 


By   QL 


0.99902   21+838  113^1  +  0      (0)** 


10 


0.1021+0  OO96O  28221+ 

0.5901+9  00001  622U8 

0.101+85  76000  00093 

0.97656  25000  000U7 

0.60U66  17600  00009 

0.2821+7  521+90  00017 

0.10737  U182U  00001+ 


10 


10 


-     10 


-     10 


-     10 


— -     10 


-     10 


+  h  (1) 

+   5  (1) 

+  7  (1) 

+  7  (1) 

+   8  (1) 

+  9  (1) 

+  10  (1) 
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TABLE  V    (continued) 


O.3U867   81+1+00  99998       0  +  10 

0.10000  00000  00000  1_       +  11 

0.25937  1*21+60  10001  2_10  +  11 

0.61917  361+22  1+0002   h       +  11 


O.3I+867  81+1+01   00016   8  +10  (1) 

0.10000  00000  00007  9, 0  +11  (l) 

0.25937  1+21+60  10012_210  +  11  (l) 

0.61917  361+22   U0036_6  +  11  (1) 


**  This    column  gives  the  number  of  iterations   required  to   isolate  the 
corresponding  eigenvalue  by   TQL1 . 


TABLE  VI.      Computed  Eigenvalues  of  Matrix  Y 


By  Multisection 

0.99902  21+838  11316  +  0 

0.1021+0   00960   28223.  0  +  1+ 

0. 5901+9  00001  62229  10  +  5 

0.101+85   76000  00085  + 

—  10 

0.97656   25000   00011  +   7 

—  10 

0.601+66  17600   00002  +   8 

0.2821+7  521+90  00001_  +  9 

0.10737  1+1821+  00000_  +  10 

0.31+867  81+1+00  99998  +  10 

0.10000   00000   00000  +  11 

-  10 

0.25937  !+2l+60  10001  +   11 

0.61917  361+22  i+oooo_  +  11 


By  QL 


0.99902  25512 

37^30 

10 

+ 

0 

(2) 

0.1021+0  00960  22381+ 

10 

+ 

1+ 

(1) 

0.5901+9  00001 

61379 

10 

+ 

5 

(1) 

0.101+85  76000  00083 

10 

+ 

7 

(1) 

0.97656  25000 

00000 

10 

+ 

7 

(1) 

0.601+66  17600 

00000 

10 

+ 

8 

(1) 

0.2821+7  521+90 

00000_ 

10 

+ 

9 

(1) 

0.10737  1+1821+ 

00000_ 

10 

+ 

10 

(1) 

0. 31+867  8I+1+01 

00000 

10 

+ 

10 

(1) 

0.10000   00000 

ooooo_ 

10 

+ 

11 

(1) 

0.25937  1+21+60 

10000_ 

10 

+ 

11 

(1) 

0.61917  361+22  1+0000 

+ 

11 

(1) 
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3.   EXECUTION  TIME 

(l)   Complete  Eigenvalue  Problems 

For  the  matrix  B  in  (U.l),  we  test  those  with  x  =  1  and  dimension 
n  =  U,  6,  8,  12,  2U,  36.   The  execution  time  for  finding  all  the  eigenvalues 
of  each  matrix  by  using  both  the  multisection  method  and  the  QL  algorithm 
on  ILLIAC  IV  simulator  are  listed  in  TABLE  VII.   The  unit  time  is  millisecond 
(ms).   For  comparison,  we  plot  the  execution  time  versus  the  dimension  of 
matrix  for  both  methods  in  Figure  7. 


TABLE  VII.   Execution  Time 


Dimension  of 
Matrix  A 

Execution 

Time 

(in  ms ) 

Multisection 

QL 

U 

2.0 

3.6 

6 

U.8 

8.1+ 

8 

2.9 

i 

13.0 

12 

6.8 

28.0 

2\ 

13.0 

99.0 

36 

1+1.0 

220.0 

For  most  of  the  test  matrices  except  that  with  n  =  36,  all  the 
eigenvalues  are  separated  in  one  execution  of  the  Sturm  sequence  process, 
It  takes  7  executions  of  the  Sturm  sequence  process  (i.e.,  p  =  7)  to 
separate  the  eigenvalues  of  the  matrix  with  n  =  36.   The  Sturm  sequence 
process  actually  doesn't  consume  much  execution  time,  but  the  data  fetch- 
ing and  manipulations  during  this  process  causes  the  execution  time  to 
increase  obviously  as  p  increases.   This  explains  the  spike  in  the 


35 


-*-.. 


N* 


3 


H^j 


K 


o 
to 

-1 

-J 


M 
H 


c 


<3        - 


<3  - 


\5     \ 


—J L-**. 


1      i I I I       I      I       I 


J I L 


J L 


la 


c 

z: 

O 
H 
c/) 
Z 

Q 


O 

o 


o 
in 


X 


a 
o 


CO 

M 
EH 


W 


a 

w 
w 


Eh 
M 

PQ 


Eh 
K 

a  o 

O   ft, 


K 

o 

o 

<: 


a  a 
o  < 
w 

M 

« 

g 
2 
o 
o 


w 


o     u-i     o 


M 


EXECUTIOII  TIME 


36 


MULTTSEC  curve  at  n  =  36  in  Fig.  7«   If  the  p  remains  almost  constant  with 
respect  to  n,  we  can  expect  the  MULTISEC  curve  to  hehave  similarly  for 
n  >  N.   The  discrepancy  between  the  execution  times  in  Figs.  6  and  7  is 
caused  by  the  variation  in  the  number  of  interpolations  (i.e.,  q  in  Fig.  6) 
used  in  the  Zero-in  algorithm. 

With  x  =  10   in  the  matrix  B,  we  test  those  with  dimension 
n  =  k,    6,  8,  12,  2k.      The  execution  time  for  finding  all  the  eigenvalues 
of  each  matrix  by  using  both  the  multisection  method  and  the  QL  algorithm 
are  shown  in  Figure  8. 
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FIGURE  8.      COMPARISON  ON  THE  EXECUTION  TIME  BETWEEN  THE 
MULTISECTION  METHOD  AND   QL  ALGORITHM  FOR  MATRIX   B  WITH   x=10~5 
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(2 )   Partial  Eigenvalue  Problems 

For  partial  eigenvalue  problems,  we  have  the  following  results: 

(1)  Matrix  B  with  x  =  10~5,  n  =  12: 

(a)  Multisection  method: 

3.h  ms   to   compute   2  leading  eigenvalues, 
3.2  ms   to   compute  2   smallest   eigenvalues. 

(b)  Conjugate  Gradient  method   [l6]: 

23  ms   to   compute  min.    eigenvalue  with   eigenvector, 
16  ms  to   compute  max.    eigenvalue  with   eigenvector. 

(2)  Matrix  B  with  x  =  1.0,    n  =  2k: 
(a)     Multisection  method: 

5.7  ms   to   compute  5  leading  eigenvalues, 

U.8  ms  to   compute   5   eigenvalues   in  the   interval    (l.O,    1.5). 
(b  )      Conjugate   Gradient  method: 

57  ms   to   compute  min.    eigenvalue  with   eigenvector, 
k9  ms  to   compute  max.    eigenvalue  with  eigenvector. 

(3)  Matrix  B  with  x  =  1.0,   n  =   36: 
(a)     Multisection  method: 

lU  ms  to   compute   7  leading  eigenvalues, 

7.1  ms   to   compute   7  eigenvalues    in  the   interval    (-1.5,    -1.0). 
If  we  use  the  QL  algorithm  for  the   cases    (l),    (2)   and   (3)  we 
still  have  to   find  all  the   eigenvalues   and  then  select  the  required  eigen- 
values   from  them.      The  time   consumed  by  the  three  methods    is   shown   in 
Figure  9.      We   didn't  test   the  bisection  method,   but   from  the   time  estimate 
in  Section  2.5,   the  curve  of  the  bisection  method  should  be  between  the   QL 
algorithm  and  the  multisection  method. 
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FIGURE  9.      COMPARISON  ON  THE  EXECUTION  TIME 
FOR   PARTIAL  EIGENVALUE  PROBLEMS 
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(3)      Special   Cases 

The  multisection  method  is  not   always   superior  to  the   QL   algorithm 
in  execution  time.      For  example,    for  the   following  symmetric  tridiagonal 
matrix, 


subdiagonal 
+7. 83768800581+ 

-8.10502269777 


10 


-  1 


10   +  0 


-2.97985867006  -11 
-1.92U6678253910  +  0 
+8.6023252670U      +  0 


diagonal 
+1+.1+002127538710  +  0 


+3. 80 3U 76 81 6l 8       +  0 

+1.0796  3101+  30  3       +  1 

+1+.2567567567710  +  0 

+6.7l+32l+32l+32310  +  0 


+8.00000000000        +   0 


+0.00000000000 
the  eigenvalues    given  "by  Wilkinson    [3]    appear  as   very  close  pairs, 


-l.59873l+293601Q  +  0, 
+l+.l+559896381+910  +  0, 
+l.6ll+27l+l+655l10  +  l, 


-1.59873l+293l+61Q  +  0 
+1+.1+559896381+910  +  0 
+1.6ll+27l+l+655310  +  l, 


thus   leading  to   a  very  slow  process    in  separating  them  by  the  multisection 
method.      The  time   required  to   find  all  the  eigenvalues   is: 

Multisection  method:      10  ms 

QL  algorithm:      5.6  ms  . 
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IV.   CONCLUSIONS 

Generally,  the  QL  algorithm  is  an  efficient  method  for  computing 
the  eigenvalues  of  a  symmetric  tridiagonal  matrix  on  a  serial  computer. 
The  average  number  of  iterations  per  eigenvalue  is  generally  less 
than  2.   For  example,  for  the  graded  matrix  (3.8)  only  one  iteration  per 
eigenvalue  was  required. 

However,  to  implement  this  algorithm  on  a  parallel  machine,  we 
obtain  very  low  efficiency  because  only  two  or  three  PEs  are  kept  busy  at 
any  moment  during  the  process.   Besides,  we  have  no  choice  but  to  find 
all  the  eigenvalues,  this  would  be  uneconomical  for  very  large  matrices. 

Comparing  the  test  results  shown  in  the  last  chapter,  we  find  that 
the  multisection  method  is  much  more  favorable  than  the  QL  algorithm  in 
the  following  cases: 

(1)  When  some  selected  eigenvalues  are  required,  for  example 
the  largest  or  smallest,  or  the  eigenvalues  in  a  given  interval. 

We  have  tried  to  find  the  largest  eigenvalue  of  matrix  B  with  x  =  1.0  and 
n  =  2k   by  using  the  Conjugate  Gradient  method  implemented  in  parallel  on 
the  ILLIAC  IV  simulator,  it  required  1+9  milliseconds.   But  the  multisection 
method  finds  the  two  leading  eigenvalues  in  only  1+.8  milliseconds. 

(2)  When  we  are  interested  only  in  the  general  distribution  of  the 
eigenvalues  rather  than  their  accurate  determination.   In  this  case  the 
multisection  method  gives  the  result  very  fast. 

(3)  When  all  the  eigenvalues  are  well  separated,  we  may  find  that 
we  have  isolated  all  the  eigenvalues  after  one  step  of  multisection. 
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APPENDIX    I.       THE   ZERO- IN  ALGORITHM 

Here  we   describe  the   zero-in  algorithm  which   is    due  to  Van 
Wijngaarden,    Zonneveld,   Dijkstra  and  Dekker   [ll,   Appendix]. 

The   zero-in  algorithm  is   an  extension  of  the   secant  method. 
For  a  real   continuous    function   f(x)   and  two  given  values   a  and  b, 
f(a)f(b)    <   0,    the  secant   method  is    defined  by 


x     =  a. 

o 


Xj_   =  b,  (Al) 


x.._    =    (x.    f(x.       )   -  x.        f(x.))/(f(x.       )    -   f(x. )) 
1+1  1         1-1  1-1         1  1-1  1 

(i  =  1,  2,  ...).  (A2) 

When  this  method  converges  to  the  simple  zero  x  of  f(x)  we  have 

xi-xt  -  c(xi-i-xt)r-  (A3) 

where   r  =    (v5   +  l)/2    [10  pp.    100-101 ]    and  hence  the   convergence   rate   is 
better  than  the  usual   simple   iteration  method   [10,   p.    97]. 

However,   the   secant  method  may  sometimes   give  a  result  which    is 
outside  of  the   interval    (a,   b)    so   that   it   converges  to   another  root  or  even 
diverges.      To   avoid  this,    it   is   useful   to   combine  the  bisection  method  and 
the   secant   rule,    essentially  this   is  the   Zero-in  algorithm. 

It   is    described  as    follows:      For  the   given  interval    (b ,   c)  which 

contains   only  the   specified  zero   of  f(x),  we  have   f(b)f(c)    <   0.      Three 

points   a    ,   b      and  c      are   chosen   as   follows: 
o        o  o 

If    |f(b)|    <    |f(c)|    then  b     =  b,    c     =   c,    a     =c.  (Ak ) 

I  '   —  '  '  o  o  o         o 

If    |f(b)|    >    |f(c)|    then  b     =   c,    c     =  b,    a     =   c   .  (A5) 

II  '  '  o  o  '     o         o 

Then,    at  the  beginning  of  the  i-th  stage,   the  three  points   a.,   b.    and  c. 
are   involved  and  such  that 


h5 
f(b.)f(c.)   <   0,    |f(b.)|    <_  |f(c.)|.  (A6) 

The  process    in  this   stage  is  then  as   follows: 

(1 )  Interpolate  between   a.    and  b.   by    (A2)   to   get   P.. 

(2)  Determine  the  mid-point  mJ    of  b.    and  c.. 

ill 

(3)  If  P.    is  between  b.    and  m. ,   then  it   is   accepted  as  b.    .,  .      Other- 

1  11  l+l 

wise  m.    as   accepted  as   b. ,_  . 
1  l+l 

( k )      Take  a . , n    =  b .    and  c . , ,    =   c . . 
l+l  1  l+l  1 

(5)  If  b  and  c  satisfy  the   conditions 
f(b.+l)f(c.+1)    <   0   and    |f(b.+1)|    <    |f(c±+;L)| 

then   go  back  to    (l)    for  the  next   stage,   otherwise  go  to    (6)   to   adjust  the 

values   of  a.    ,  ,   b.    ,    and  c. 

1+1'      i+1  i+1 

(6)  If  f(b.    , )f(c.      )   >  0  then  we  take  c         =  b. ;   this  will   ensure 
that    f(b.+1)f(ci+1)   <  0. 

(7)  If    I  f  (b.  ,_  )  I    >    If  (c.  ,_  )  I    then  we   interchange  b.  ,_    and  c.  ...    and 

1        1+1    '         '        1+1    '  1+1  1+1 

take  a.    .    to  be  the  value  of  new  c.  ,_  .      The   right   conditions  now  hold  for 
l+l  l+l 

the  beginning  of  next   stage.      Of  course,   once  we   find  b.    in   each   stage,   we 
check  to   see  whether  to   accept   it   as   a  reasonable  root  of  the   function  f(x) 
The  stopping  criterion   for  the   zero-in  algorithm  seems  to  be  a 

bit    involved.      To  use  the   criterion    lb.    -  P. I    <   e   is   unreliable.      For 

I  1  1 ' 

example   in  the   case  of  Fig.   Al   if    |f(b    )|«|f(c    )|   the  quantity    lb     -  P 
r  '        o    '  .    ■        o    '  '   o         o ' 

will  be  very   small   although  neither  b     nor  P     is  near  the  required  zero. 

o  o 

Since  the   required  zero   is  between  b.    and  c..    to  use    |b.    -   c.      as  the 

II  '    1  1 ' 

stopping  criterion  seems   all   right,   but   unfortunately  it  may  never  be 
satisfied.      See  Fig.   A2. 
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To  overcome  the   difficulty  described  above,  Wijngaarden  et.    al. 
[11,   Appendix]    suggests   a  simple  stratagem.      Suppose  the  stopping  criterion 
is    lb.    -   c.  I    <  toil.      Then  if    |P.    -  b.  I    <  toil,   then  P.    is   replaced  by 
P.    +  sign    (c.    -  b.  )   x  toil,   this  will   ensure  that  a  b  is    finally  beyond 

the   required  zero   and  makes    f (b...  )f  (c._,n  )    >  0.      Then   c.in    is   switched  as 

*  l+l    l+l  l+l 

mentioned  in  process  (7)  and  this  immediately  gives  a  b.,_,  and  c.  ,  con- 

l+l  l+l 

taining  the  zero   and  with    lb.  ,_    -   c.  ,_  I    <  toi.      The   choice  of  toil  is 

1  l+l    l+l ' 

experiemental  and  depends  on  the  machine  used. 
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APPENDIX  II.   GLYPNIR  PROGRAMS 

The  ILLIAC  IV  GLYPNIR  program,  "MULTISEC"  is  listed  below.   It 
contains  two  major  subroutines:   "STURMPROC"  and  "ZEROIN".   "STURMPROC" 
is  used  to  obtain  the  intervals  such  that  each  of  them  contains  exactly- 
one  of  the  eigenvalues  of  a  symmetric  tridiagonal  matrix  A  in  a  given 
interval.   If  all  the  eigenvalues  are  required  the  interval  is 
(-||A|  ^j  MaII^).   The  "ZEROIN"  routine  is  used  to  compute  the  eigen- 
values contained  in  the  intervals  given  by  "STURMPROC".   Switch  "CHOICE" 
is  given  to  provide  the  choice  of  using  either  one  or  both  of  "STURMPROC" 
and  "ZEROIN".   If  only  "ZEROIN"  is  used,  the  used  should  give  the  upper 
and  lower  bounds  of  each  interval  containing  only  one  eigenvalue. 
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SUBROUTINE  MULTISEC  (CINT  N.CREAL  EPS.PCPOINT  OtPCPOINT  E.PCPOINT  EIGL» 
PCPOINT  EIGU.CINT  ISW.CREAL  HI.CREAL  LO.CINT  CHOICE. 
CINT  MITER) t 
BEGIN 
CINT   NM.MAXNU  LABEL  CH1.CH2I  PREAL  A.BI   PE  REAL  VECTOR  Pllll 

*  THIS  SUBROUTINE  CONSISTS  OF  TWO  MAJOR  SUBROUTINES:  STURMPROC  i  ZEROIN 
%  STURMPROC   FINDS  ALL  THE  INTERVALS  SUCH  THAT  EACH  OF  THEM  CONTAINS 

%  EXACTLY  ONE  EIGENVALUE.   ZEROIN   FINDS  ALL  THE  EIGENVALUES  CONTAINED 

*  IN  THE  INTERVALS  FOUND  BY  STURMPROC  OR  BY  GIVEN.    EITHER  ONE  OR  BOTH 

*  OF  THEM  CAN  BE  CALLED  BY  USING  SWITCH:   CHOICE. 

*  CHOICE=l  MEANS  CALL  STURMPROC  ONLY. 

*  CHOICE=2  MEANS  CALL  ZEROIN  ONLY. 
%     CHOICE=0  MEANS  CALL  BOTH. 

*  ISW=0  MEANS  COMPLETE  EIGENVALUE  PROBLEM.  OTHERWISE  MEANS  PARTIAL 

*  EIGENVALUE  PROBLEM. 

*  MITER:  THE  MAX  NUMBER  OF  ITERATIONS  IN  ZERO-IN   ALGORITHM. 


SUBROUTINE  STURM»ROC» 
BEGIN 

CU  REAL  NORM,H.TMP»TMPl» 

CU  INTEGER  I . JtKtL tVARUP.G.S: 

CU  REAL  VECTOR  MOOtNOMOU 

PE  REAL  PEL.TE«P| 

PE    INTEGER    AGR. VAR.NN.LLMT.KSW.FINISH.N1 : 

PE    REAL    VECTOR    JP.LWllOU 

PE    INTEGER    VECTOR    VUPI10H 

LABFL      GOUT.RPT.riNSJ 

% 

SUBPOUTINE  STU3MI 

*  FIND  THE  AGREEMENT  OF  SIGNS  IN  STURM  SEQUENCE 

BEGIN 

CU  REAL  OD.P3! 
PE  REAL  oi.Ql.Rll 
LABEL  ENDS! 
IF  G=0  THEN 
BEGIN 
AGR:=0; 
TEMP:=D(0J: 
DD:=GRABONE (TEMP.O) » 
Rl  :=DD-PELJ 
Pi:=i:   Oi:=Rl* 
IF  Ql  GEO  0  THEN  AGR:=AGR*H 
LOOP  Lt=l.l»N-l  DO 
BEGIN 

*    GRAB    DID    AND    ESQIL-l  ) 
J:=L   DIV    64: 
TEMP:=Dt  JH 
0O:=GRAB3NE<TEMP,L) » 
J:=    (L-l)    DIV    64: 
TEMP:*P{ J): 

pp:=GRAB3NE(TEMP»L-1) I 
% 

Ri:=<DD-°EL)*Ql-PP#Pl» 
Pi: =01:      Oi:=Ri: 

If    PI  GEO  0  EQV  Ql  GEO  0  THEN  AGR:=AGR*1 
END!      *  END  OF  LOOP  L 

IF  Q1=0  AND  P1>0  THEN  AGR:=AGR-1 
END       ELSE 
BEGIN 


00000100???? 
00000200???? 
00000300???? 
00000400???? 
00000500???? 
.00000600???? 
00000700???? 
00000800???? 
00000900???? 
00001000???? 
0000110077?? 
00001200???? 
00001300???? 

oooouoo???? 

00001500???? 
00001600???? 
00001700???? 
00001800???? 
00001900???? 
00002000???? 
00002100???? 
00002200???? 
00002300???? 
00002400???? 
00002500???? 
00002600???? 
00002700???? 
00002800???? 
00002900???? 
00003000???? 
00003100???? 
00003200???? 
00003300???? 
00003400???? 
00003500???? 
0P003600???? 
00003700???? 
00003800???? 
00003900???? 
00004000???? 
00004100???? 
00004200???? 
00004300???? 
00004400???? 
00004500???? 
00004600???? 
00004700???? 
00004800???? 
00004900???? 
00005000???? 
00005100???? 
00005200???? 
00005300???? 
00005400???? 
00005500???? 
00005600???? 
00005700???? 
00005800???? 
00005900???? 
00006000???? 
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iABS(TMP)/EPS; 


TEMP:  =01  0U 

DO  t  =GRABONE ( TEMP .0)1 

Qi:=DD-PEL»     AGR:=OI 

IF  01  GEO  0  THEN  AGR:*AGR*1» 

LOOP  L:»1»1»N-1  00 

BEGIN 

J:=L  OIV  641 

TEMPtxDf J)» 

OD:=GRABONE(TEMP.L) » 

J:=(L-1)  OIV  64» 

TEMP:=P( J)l     PP:=GRABONE(TEMPtL-l) J 

TEMP:=EIJ)»     TMPtsGRABONE(TEMPtL-l) « 

IF  Ql  NEO  0  THEN  TEMP:=PP/Q1  ELSE  TEMPI. 

* 

Qi:=DO-PEL-TEMP; 

IF  01  GEO  0  THEN  AGR:=AGR*1 

END  J      *  END  OF  LOOP  L 

IF  01=0  THEN  AGR:=AGR-1 
FNDl 
ENDS: 

END;     *  ENO  OF    SUBROUTINE  STURM 
SUBROUTINE  IDENTIFY! 
BEGIN 

mooe:=truej 

m00e:=pen<63  and  true* 

var:=agr-rtl(1.»agr) ; 

mode:=pen=63  anc  true; 

if  isw=1  then  begin  isw :=isw-1 i  var:=0  end  else  var : =agr-varupl 

mode:=true» 

for  all  var>1  do 

BEGIN 

NN:=NN*1 » 
LW(NNl:=PEL? 

up[nni:=pel*h; 

VUP(NNl:=A3R-VAR{ 
FINISH:=0» 

end; 

for  all  var=1  do- 

BFGIN 

Nl :=Nl*lt 
EIGLJN1 ]:=»EL; 
EIGUf Nl J:==>EL*H» 

fnD? 
enoj    *  end  df  identify 

* 

*  PREPARE  FOR  MAIN  PROG 
%  FIND  THE  NORM  OF  MATRIX 

K:=  (N-l)  DIV  641 

IF  K>0  THEN 

LOOP  I  :  =0  •  1  •  <- 1  DO  M0DU):=6<V; 

M0D(K]:=N-K«64I 

loop  i  :=0» 1 »  <  00  begin 
mode:=true; 

m0de:=pen<m0d( ii   and  pen>0  and  true* 
r{ i  i:=abs(dl i 1)+abs(e( 1 1 ) *abs (rtr ( 1 . .ei  i )) )  end  i 

mooet=true»     modf»=pen=0  and  true* 

PfOi:=ABS<D[0])*ABS(E[0)>  5 
LOOP  i:=l.l.<  00 

P[ I  ):=ABS(Dl I1)*ABS(EI I1)*ABS(RTR(1.»EII-1)))  » 
LOOP  I:=0»1»K  DO  BEGIN 


00006100???? 
00006300???? 
00006300???? 
00006400???? 
00006500???? 
00006600???? 
00006700???? 
00006800???? 
00006900???? 
00007000???? 
00007100???? 
00007200???? 
00007300???? 
00007400???? 
00007500???? 
00007600???? 
00007700???? 
00007800???? 
00007900???? 
00008000???? 
00008100???? 
00008200???? 
00008300???? 
00008400???? 
00008500???? 
00008600???? 
00008700???? 
00008800???? 
00008900???? 
00009000???? 
00009100???? 
00009200???? 
00009300???? 
00009400???? 
00009500???? 
00009600???? 
00009700???? 
00009800???? 
00009900???? 
00010000???? 
00010100???? 
00010200???? 
00010300???? 
00010400???? 
00010500???? 
00010600???? 
00010700???? 
00010800???? 
00010900???? 
00011000???? 
00011100???? 
00011200???? 
00011300???? 
00011400???? 
00011500???? 
00011600???? 
00011700???? 
00011800???? 
00011900???? 
00012000???? 
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AND  TRUE! 


NORM:=NOM[ I )t 
IS".  1  WORD  REAL: 


(NORM) ) ; 


*  FIND  B[I)=Em»EU]  TO  BE  USED  IN  STURM 
=TRUE5 
i:=0.1.K  DO 


MODE:=TRUE»   MOOE:=PEN<MODlKJ 

NOMl I J:=MAX(PI II)   ENOI 
NORm:=NOM(0II 
LOOP  i:=l?l*<<  DO 

IF  NOMl  I  )>NORM  THEN 
SIMWRITE  (LI^E."  NORM 
* 

*  SEARCH  FOR  VERY  SMALL  EI  I) 
LOOP  i:=0.1»K  DO 
BEGIN 

mode:=truE;  mode:=pen<mod( I l  and  true; 

ksw:=o; 

Temp:=EC  II: 

IF  ARS  (TEMP)  /NORM  <  1.0(i-6   THEN  KSW:=ll 

G:=MAX(KSW) I 

IF  6=1  THEN  50  TO  GOUT; 
END: 
GOUT 
MODE 
LOOP 
BEGIN 

IF  Ell  1  =  0  THEN  P( I J:=EPS*NORM«NORM 
ELSE  P(I1:=E(I1*E[I1 
END; 
LO0D  l:=0«lfK  DO   BEGIN 

EI6L(Il:=O.OI   EIGUfI]:=0.0   ENDS 
LOOP  It=0»ltlO  DO  BEGIN 

Lwr I  )  :=0.0 t   UP(I):=0.0» 

*  BFGIN  MAIN  P30G 

*  DISTRUBUTE  LAMBDA  TO  ALL 
IF  ISW  =  0  THEN  3EGIN 
H:aN0PM/32;      PEL:=-NORM*PEN«H| 

ENO  ELSE  BEGIN  H: = (HI-LO) /63I   PEL :=LO*PEN«H  END! 
VARUP:=0»   NN:=-ll   Ni:=-i:   LLMT:=-1»   finISh:»0; 
STURM;   IDFNTIfY; 
RPT:   MODE:=TRUE« 

IF  LLMT  GEO  NN  THEN  F IN  I SH : =- (PEN* 1 )  ELSE  FINISH:=PEN; 

S:-MAX (FINISH) j 

ir  S<0  THEN  30  TO  FINS; 

MO0F:=TRUE;     MODE:=PEN=S  AND  TRUE; 

LLMT:=LLMT*1 : 

varup:=vupillmt j ; 

tmp:=lw(LLMT ) ; 

H:=(UP[LLMT)-LW[LLMT  l)/64? 

IF    h<E3S  THEN  BEGIN 

TMP:=LW(LLMTH     TMP1 :=UP t LLMT )» 
SIMWRITE  (LINE."   SOME  E-VALUES  ARE 
SIMWRITE  <UNE." 
1  WORD  REAL: 
GO  TO  RPT    END; 
mode:=TRUEi    pel:=Tmp*pen«h; 
sturm;   identify; 
GO  TO  rpt; 

%  PRINT  all   intervals  which  contain 
fins:      M0DE:=T9UE;  k:=max<nim 

SIMWRITE   (LINE."  INTERVALS  CONTAIN 


vup(ii:=o  end; 


PES 


TOO  CLOSE  TO  IDENTIFY"); 
THEY  ARE  IN  THE  INTERVAL:.  LOWER". 
<TMP)»"  UPPER".  1  WORD  REAL:  (TMP1))I 


SINGLE  E-VALUE 
1  E-VALUE  ARE") ; 


LOOP  i:=o 
SIMWRITE 
LOOD  i:=0 
SIMWRITE 


l.K  DO 
(LINE." 
l.K  DO 
(LINE." 


EIGL  "  .EIGLII I) ; 
EIGU  "  .EIGUI  II)   ; 


00012100???? 
00012200???? 
00012300???? 
00012400???? 
00012500???? 
00012600???? 
00012700???? 
00012800???? 
00012900???? 
00013000???? 
00013100???? 
00013200???? 
00013300???? 
00013400???? 
00013500???? 
00013600???? 
00013700???? 
00013800???? 
00013900???? 
00014000???? 
00014100???? 
00014200???? 
00014300???? 
00014400???? 
00014500???? 
00014600???? 
00014700???? 
00014800???? 
00014900???? 
00015000???? 
00015100???? 
00015200???? 
00015300???? 
00015400???? 
00015500???? 
00015600???? 
00015700???? 
00015800???? 
00015900???? 
00016000???? 
00016100???? 
00016200???? 
00016300???? 
00016400???? 
00016500???? 
00016600???? 
00016700???? 
00016800???? 
00016900???? 
00017000???? 
00017100???? 
00017200???? 
00017300???? 
00017400???? 
00017500???? 
00017600???? 
00017700???? 
00017800???? 
00017900???? 
00018000???? 
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MAXNl:=Kt 
ENO«     *  END 


STURMPROC 


SUBROUTINE  ZEROntPREAL  A»  PREAL  B)  1 
BEGIN 

CREAL  RTOL; 

CINT  ItJtKt 

CU  REAL  VECTOR  MODI  31 t 

PREAL  C»FA.FB»rCtINTtMID»TOL»TEMP* 

PINT  ITRt 

PE  REAL  VECTOR  DD»PP(12}t 

LABFL  ROOTt 

BOOLEAN  SAVEMOOEl 

% 

SUBROUTINE   FUM(PREAL  X. PREAL  OUT  FX) I 

BEGIN 

PPEAL  «»SI   CINT  It 

R:=i:   S:=OD(0J-Xt 

LOOP  l:=l»l»M-l  DO 

BEGIN 

Fx:  =  (nom-x)*s-ppti-ii«Rt 

R:=St   S:=fxt 

ENDt 
ENOt   *  END  OF   SUBR   FUN 
*  MOVE  THE  MATRIX  INTO  EACH  PE 
K:=(N-1)  DIV  6^t 
IF  too  THEN 

LOO"  I:=0»1»K-1  DO  M0D(I):=63t 
MODf K ):=N-K»64-lt 
MODE:=TRUEt 
LOOP  i:=0.1.K  DO 
BEGIN 

TFMD:=Df I  )  t 

LOOP  j:*0.1»*ODl I  1  DO 

DO! J>64«I ] :=3RABONE (TEMPtJ) t 

TEMP:=P( I]t 

LOGO  j:=0»l«MODl  I  1  DO 

PP(  J*64*I  j:=r5RABONE(TEMP,J)  t 
END;     %  END  LOOP  I 
%  BEGIN  THE  PROCESS  OF  ZEROIN 
ITR:=0t 
RTOL  :=4*EPSt 
MODE:=TRUES 

MODE:=A  NEQ  0  OR   B  NEQ  0  AND  TRUEt 
FUN(A.FA) ? 
FUN(B»FB) « 
C:=A»   FC:=FAt 
IF  ABS(F8)=ABS(FC)  THEN 
BEGIN 

A:=Bt   FA:=F9t   BS=Ct   FB:=FCt 

C:=At   FC:=FA» 
ENOt 

MID:=(B*C)/2t 
TOL:=RTOL«ABS(3)*EPSt 
% 

H(HILE  ABS(MID-9)>TOL  AND  ITR<MITER  DO 
BEGIN 
IF  FA  NEQ  FB  T^N  INT  :  =  ( A»FB-B»FA)  /  (FB-FA) 


SAVEMODE:»MODEt 
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IF  A 

A:=e 
IF  S 

FUN< 
MODE 
FOR 
BEGI 
IF  A 
BFGI 
a: 
C: 
end: 
mid: 
Tol: 
ITR: 
END: 
* 
ROOT. 
5IMW 
SIMW 

end: 
if 

IF 

CH2:    M 

CHi:    s 

end: 


ELSE    INT:=MID; 
5SUNT-BXT0L    THEN    INT  J=B*SI6N  (C-B)  *TOLI 
;       fa:=fbi 
IGN(INT-MID)=SIGN<B-INT)     THEN    Bt=INT 

ELSE    B:=MIDI 
9.FB)  \ 

:=ABS(FB)>4*N«>EPS  AND  MODE! 
ALL  SIGN(FC)=SIGN(F9)  DO 
N   C:=A;   rC:=FA   ENOI 
BS(f"B)=ABS(FC>  THEN 
N 

=H»   fa:=f=u   B:=Ct   FB:=FC* 
=a»   fc:=Fa; 

=<b*C>/2; 

=RTOL»ABS<3>*EPSt 
=ITR*i: 


mode:=savemode; 

RITE  (LlNEt"  EIGENVALUES  ARE'SBM 
RITE  (LlNE^'ITERs'S  ITR)   I 
*  END  Of  ZEROIN 

CHOICE=2  THEN  GO  TO  CH21     STURMPROCI 

CHOICE=l  THEN  GO  TO  CH1I 
nDE:=TRUE:   LOOP  NM:=0»ltMAXNl  DO 

3EGIN    A:=EIGLINM]J       B:=EIGU(NMH       ZEROINU.BM       ENDl 
IMWRITE     (LINE. "THIS    IS    THE    END    OF    PROGRAM") 

%   END  OF  ^JLTISEC  ••*••»••**••••••♦••••••»*»••»•»*•••• 
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